<!DOCTYPE html>

<html>
  <head>
    <meta charset="utf-8">
    
    <title>numpy.lib.format &mdash; NumPy v1.18 Manual</title>
    
    <link rel="stylesheet" type="text/css" href="../../_static/css/spc-bootstrap.css">
    <link rel="stylesheet" type="text/css" href="../../_static/css/spc-extend.css">
    <link rel="stylesheet" href="../../_static/scipy.css" type="text/css" >
    <link rel="stylesheet" href="../../_static/pygments.css" type="text/css" >
    <link rel="stylesheet" href="../../_static/graphviz.css" type="text/css" >
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '../../',
        VERSION:     '1.18.1',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  false
      };
    </script>
    <script type="text/javascript" src="../../_static/jquery.js"></script>
    <script type="text/javascript" src="../../_static/underscore.js"></script>
    <script type="text/javascript" src="../../_static/doctools.js"></script>
    <script type="text/javascript" src="../../_static/language_data.js"></script>
    <script type="text/javascript" src="../../_static/js/copybutton.js"></script>
    <link rel="author" title="About these documents" href="../../about.html" >
    <link rel="index" title="Index" href="../../genindex.html" >
    <link rel="search" title="Search" href="../../search.html" >
    <link rel="top" title="NumPy v1.18 Manual" href="../../index.html" >
    <link rel="up" title="Input and output" href="../routines.io.html" >
    <link rel="next" title="Linear algebra (numpy.linalg)" href="../routines.linalg.html" >
    <link rel="prev" title="numpy.DataSource.open" href="numpy.DataSource.open.html" > 
  </head>
  <body>
<div class="container">
  <div class="top-scipy-org-logo-header" style="background-color: #a2bae8;">
    <a href="../../index.html">
      <img border=0 alt="NumPy" src="../../_static/numpy_logo.png"></a>
    </div>
  </div>
</div>


    <div class="container">
      <div class="main">
        
	<div class="row-fluid">
	  <div class="span12">
	    <div class="spc-navbar">
              
    <ul class="nav nav-pills pull-left">
        <li class="active"><a href="https://numpy.org/">NumPy.org</a></li>
        <li class="active"><a href="https://numpy.org/doc">Docs</a></li>
        
        <li class="active"><a href="../../index.html">NumPy v1.18 Manual</a></li>
        

          <li class="active"><a href="../index.html" >NumPy Reference</a></li>
          <li class="active"><a href="../routines.html" >Routines</a></li>
          <li class="active"><a href="../routines.io.html" accesskey="U">Input and output</a></li> 
    </ul>
              
              
    <ul class="nav nav-pills pull-right">
      <li class="active">
        <a href="../../genindex.html" title="General Index"
           accesskey="I">index</a>
      </li>
      <li class="active">
        <a href="../routines.linalg.html" title="Linear algebra (numpy.linalg)"
           accesskey="N">next</a>
      </li>
      <li class="active">
        <a href="numpy.DataSource.open.html" title="numpy.DataSource.open"
           accesskey="P">previous</a>
      </li>
    </ul>
              
	    </div>
	  </div>
	</div>
        

	<div class="row-fluid">
      <div class="spc-rightsidebar span3">
        <div class="sphinxsidebarwrapper">
  <h3><a href="../../contents.html">Table of Contents</a></h3>
  <ul>
<li><a class="reference internal" href="#">numpy.lib.format</a><ul>
<li><a class="reference internal" href="#npy-format">NPY format</a><ul>
<li><a class="reference internal" href="#capabilities">Capabilities</a></li>
<li><a class="reference internal" href="#limitations">Limitations</a></li>
<li><a class="reference internal" href="#file-extensions">File extensions</a></li>
<li><a class="reference internal" href="#version-numbering">Version numbering</a></li>
<li><a class="reference internal" href="#format-version-1-0">Format Version 1.0</a></li>
<li><a class="reference internal" href="#format-version-2-0">Format Version 2.0</a></li>
<li><a class="reference internal" href="#format-version-3-0">Format Version 3.0</a></li>
<li><a class="reference internal" href="#notes">Notes</a></li>
</ul>
</li>
</ul>
</li>
</ul>

  <h4>Previous topic</h4>
  <p class="topless"><a href="numpy.DataSource.open.html"
                        title="previous chapter">numpy.DataSource.open</a></p>
  <h4>Next topic</h4>
  <p class="topless"><a href="../routines.linalg.html"
                        title="next chapter">Linear algebra (<code class="xref py py-mod docutils literal notranslate"><span class="pre">numpy.linalg</span></code>)</a></p>
<div id="searchbox" style="display: none" role="search">
  <h4>Quick search</h4>
    <div>
    <form class="search" action="../../search.html" method="get">
      <input type="text" style="width: inherit;" name="q" />
      <input type="submit" value="search" />
      <input type="hidden" name="check_keywords" value="yes" />
      <input type="hidden" name="area" value="default" />
    </form>
    </div>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div>
          <div class="span9">
            
        <div class="bodywrapper">
          <div class="body" id="spc-section-body">
            
  <div class="section" id="module-numpy.lib.format">
<span id="numpy-lib-format"></span><h1>numpy.lib.format<a class="headerlink" href="#module-numpy.lib.format" title="Permalink to this headline">¶</a></h1>
<p>Binary serialization</p>
<div class="section" id="npy-format">
<h2>NPY format<a class="headerlink" href="#npy-format" title="Permalink to this headline">¶</a></h2>
<p>A simple format for saving numpy arrays to disk with the full
information about them.</p>
<p>The <code class="docutils literal notranslate"><span class="pre">.npy</span></code> format is the standard binary file format in NumPy for
persisting a <em>single</em> arbitrary NumPy array on disk. The format stores all
of the shape and dtype information necessary to reconstruct the array
correctly even on another machine with a different architecture.
The format is designed to be as simple as possible while achieving
its limited goals.</p>
<p>The <code class="docutils literal notranslate"><span class="pre">.npz</span></code> format is the standard format for persisting <em>multiple</em> NumPy
arrays on disk. A <code class="docutils literal notranslate"><span class="pre">.npz</span></code> file is a zip file containing multiple <code class="docutils literal notranslate"><span class="pre">.npy</span></code>
files, one for each array.</p>
<div class="section" id="capabilities">
<h3>Capabilities<a class="headerlink" href="#capabilities" title="Permalink to this headline">¶</a></h3>
<ul class="simple">
<li><p>Can represent all NumPy arrays including nested record arrays and
object arrays.</p></li>
<li><p>Represents the data in its native binary form.</p></li>
<li><p>Supports Fortran-contiguous arrays directly.</p></li>
<li><p>Stores all of the necessary information to reconstruct the array
including shape and dtype on a machine of a different
architecture.  Both little-endian and big-endian arrays are
supported, and a file with little-endian numbers will yield
a little-endian array on any machine reading the file. The
types are described in terms of their actual sizes. For example,
if a machine with a 64-bit C “long int” writes out an array with
“long ints”, a reading machine with 32-bit C “long ints” will yield
an array with 64-bit integers.</p></li>
<li><p>Is straightforward to reverse engineer. Datasets often live longer than
the programs that created them. A competent developer should be
able to create a solution in their preferred programming language to
read most <code class="docutils literal notranslate"><span class="pre">.npy</span></code> files that he has been given without much
documentation.</p></li>
<li><p>Allows memory-mapping of the data. See <em class="xref py py-obj">open_memmep</em>.</p></li>
<li><p>Can be read from a filelike stream object instead of an actual file.</p></li>
<li><p>Stores object arrays, i.e. arrays containing elements that are arbitrary
Python objects. Files with object arrays are not to be mmapable, but
can be read and written to disk.</p></li>
</ul>
</div>
<div class="section" id="limitations">
<h3>Limitations<a class="headerlink" href="#limitations" title="Permalink to this headline">¶</a></h3>
<ul class="simple">
<li><p>Arbitrary subclasses of numpy.ndarray are not completely preserved.
Subclasses will be accepted for writing, but only the array data will
be written out. A regular numpy.ndarray object will be created
upon reading the file.</p></li>
</ul>
<div class="admonition warning">
<p class="admonition-title">Warning</p>
<p>Due to limitations in the interpretation of structured dtypes, dtypes
with fields with empty names will have the names replaced by ‘f0’, ‘f1’,
etc. Such arrays will not round-trip through the format entirely
accurately. The data is intact; only the field names will differ. We are
working on a fix for this. This fix will not require a change in the
file format. The arrays with such structures can still be saved and
restored, and the correct dtype may be restored by using the
<code class="docutils literal notranslate"><span class="pre">loadedarray.view(correct_dtype)</span></code> method.</p>
</div>
</div>
<div class="section" id="file-extensions">
<h3>File extensions<a class="headerlink" href="#file-extensions" title="Permalink to this headline">¶</a></h3>
<p>We recommend using the <code class="docutils literal notranslate"><span class="pre">.npy</span></code> and <code class="docutils literal notranslate"><span class="pre">.npz</span></code> extensions for files saved
in this format. This is by no means a requirement; applications may wish
to use these file formats but use an extension specific to the
application. In the absence of an obvious alternative, however,
we suggest using <code class="docutils literal notranslate"><span class="pre">.npy</span></code> and <code class="docutils literal notranslate"><span class="pre">.npz</span></code>.</p>
</div>
<div class="section" id="version-numbering">
<h3>Version numbering<a class="headerlink" href="#version-numbering" title="Permalink to this headline">¶</a></h3>
<p>The version numbering of these formats is independent of NumPy version
numbering. If the format is upgraded, the code in <em class="xref py py-obj">numpy.io</em> will still
be able to read and write Version 1.0 files.</p>
</div>
<div class="section" id="format-version-1-0">
<h3>Format Version 1.0<a class="headerlink" href="#format-version-1-0" title="Permalink to this headline">¶</a></h3>
<p>The first 6 bytes are a magic string: exactly <code class="docutils literal notranslate"><span class="pre">\x93NUMPY</span></code>.</p>
<p>The next 1 byte is an unsigned byte: the major version number of the file
format, e.g. <code class="docutils literal notranslate"><span class="pre">\x01</span></code>.</p>
<p>The next 1 byte is an unsigned byte: the minor version number of the file
format, e.g. <code class="docutils literal notranslate"><span class="pre">\x00</span></code>. Note: the version of the file format is not tied
to the version of the numpy package.</p>
<p>The next 2 bytes form a little-endian unsigned short int: the length of
the header data HEADER_LEN.</p>
<p>The next HEADER_LEN bytes form the header data describing the array’s
format. It is an ASCII string which contains a Python literal expression
of a dictionary. It is terminated by a newline (<code class="docutils literal notranslate"><span class="pre">\n</span></code>) and padded with
spaces (<code class="docutils literal notranslate"><span class="pre">\x20</span></code>) to make the total of
<code class="docutils literal notranslate"><span class="pre">len(magic</span> <span class="pre">string)</span> <span class="pre">+</span> <span class="pre">2</span> <span class="pre">+</span> <span class="pre">len(length)</span> <span class="pre">+</span> <span class="pre">HEADER_LEN</span></code> be evenly divisible
by 64 for alignment purposes.</p>
<p>The dictionary contains three keys:</p>
<blockquote>
<div><dl class="simple">
<dt>“descr”<span class="classifier">dtype.descr</span></dt><dd><p>An object that can be passed as an argument to the <a class="reference internal" href="numpy.dtype.html#numpy.dtype" title="numpy.dtype"><code class="xref py py-obj docutils literal notranslate"><span class="pre">numpy.dtype</span></code></a>
constructor to create the array’s dtype.</p>
</dd>
<dt>“fortran_order”<span class="classifier">bool</span></dt><dd><p>Whether the array data is Fortran-contiguous or not. Since
Fortran-contiguous arrays are a common form of non-C-contiguity,
we allow them to be written directly to disk for efficiency.</p>
</dd>
<dt>“shape”<span class="classifier">tuple of int</span></dt><dd><p>The shape of the array.</p>
</dd>
</dl>
</div></blockquote>
<p>For repeatability and readability, the dictionary keys are sorted in
alphabetic order. This is for convenience only. A writer SHOULD implement
this if possible. A reader MUST NOT depend on this.</p>
<p>Following the header comes the array data. If the dtype contains Python
objects (i.e. <code class="docutils literal notranslate"><span class="pre">dtype.hasobject</span> <span class="pre">is</span> <span class="pre">True</span></code>), then the data is a Python
pickle of the array. Otherwise the data is the contiguous (either C-
or Fortran-, depending on <code class="docutils literal notranslate"><span class="pre">fortran_order</span></code>) bytes of the array.
Consumers can figure out the number of bytes by multiplying the number
of elements given by the shape (noting that <code class="docutils literal notranslate"><span class="pre">shape=()</span></code> means there is
1 element) by <code class="docutils literal notranslate"><span class="pre">dtype.itemsize</span></code>.</p>
</div>
<div class="section" id="format-version-2-0">
<h3>Format Version 2.0<a class="headerlink" href="#format-version-2-0" title="Permalink to this headline">¶</a></h3>
<p>The version 1.0 format only allowed the array header to have a total size of
65535 bytes.  This can be exceeded by structured arrays with a large number of
columns.  The version 2.0 format extends the header size to 4 GiB.
<a class="reference internal" href="numpy.save.html#numpy.save" title="numpy.save"><code class="xref py py-obj docutils literal notranslate"><span class="pre">numpy.save</span></code></a> will automatically save in 2.0 format if the data requires it,
else it will always use the more compatible 1.0 format.</p>
<p>The description of the fourth element of the header therefore has become:
“The next 4 bytes form a little-endian unsigned int: the length of the header
data HEADER_LEN.”</p>
</div>
<div class="section" id="format-version-3-0">
<h3>Format Version 3.0<a class="headerlink" href="#format-version-3-0" title="Permalink to this headline">¶</a></h3>
<p>This version replaces the ASCII string (which in practice was latin1) with
a utf8-encoded string, so supports structured types with any unicode field
names.</p>
</div>
<div class="section" id="notes">
<h3>Notes<a class="headerlink" href="#notes" title="Permalink to this headline">¶</a></h3>
<p>The <code class="docutils literal notranslate"><span class="pre">.npy</span></code> format, including motivation for creating it and a comparison of
alternatives, is described in the <a class="reference external" href="https://www.numpy.org/neps/nep-0001-npy-format.html">“npy-format” NEP</a>, however details have
evolved with time and this document is more current.</p>
</div>
</div>
</div>


          </div>
        </div>
          </div>
        </div>
      </div>
    </div>

    <div class="container container-navbar-bottom">
      <div class="spc-navbar">
        
      </div>
    </div>
    <div class="container">
    <div class="footer">
    <div class="row-fluid">
    <ul class="inline pull-left">
      <li>
        &copy; Copyright 2008-2019, The SciPy community.
      </li>
      <li>
      Last updated on Feb 20, 2020.
      </li>
      <li>
      Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 2.4.2.
      </li>
    </ul>
    </div>
    </div>
    </div>
  </body>
</html>